There has been a lot of media attention on teacher salaries. This exploration covers various elements of how money is spent on education. One area of particular interest is spending in Oklahoma, which has gotten a lot of attention in the news recently. The exploration covers overall trends in spending per student across the country, then focuses on Oklahoma vs. California vs. New Jersey to see how they compare.
Example YouTube video of teachers in Oklahoma changing jobs.
The government conducts a census of all public schools in the United States. The 2016 data is published here:
2016 Public Elementary-Secondary Education Finance Data
Since none of the provided tables contained all the desired information, a cleaned dataset was created using the following information:
Tables attached to project assignment:
The cleaned data has the following column headers:
After cleaning the data, decided to include only school districts with at least 50 students enrolled.
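The enrollment cutoff can be sketched in base R as follows. This is a minimal illustration, not the original cleaning code: `ENROLL` is the column name from the cleaned data, while the `NAME` column and all values are made up.

```r
# Toy stand-in for the cleaned data; NAME and every value are hypothetical.
districts <- data.frame(
  NAME   = c("District A", "District B", "District C", "District D"),
  ENROLL = c(12, 50, 1163, 49)
)

# Keep only districts with at least 50 students enrolled.
districts_50 <- subset(districts, ENROLL >= 50)
print(districts_50)
```

Using `>=` keeps a district with exactly 50 students, which matches the "at least 50" wording.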
Added a few more columns later in the exploration:
Note: Teacher compensation = Salary + benefits
##       ENROLL            TOTALREV           TFEDREV
##  Min.   :    50.0   Min.   :     234   Min.   :      0
##  1st Qu.:   452.5   1st Qu.:    6388   1st Qu.:    347
##  Median :  1163.0   Median :   16328   Median :    920
##  Mean   :  3724.4   Mean   :   51223   Mean   :   3961
##  3rd Qu.:  3059.5   3rd Qu.:   42748   3rd Qu.:   2590
##  Max.   :981667.0   Max.   :27448356   Max.   :1739101
##       TSTREV            TLOCREV            TOTALEXP
##  Min.   :       0   Min.   :       0   Min.   :     301
##  1st Qu.:    2884   1st Qu.:    2206   1st Qu.:    6245
##  Median :    7775   Median :    6105   Median :   15962
##  Mean   :   24103   Mean   :   23159   Mean   :   50922
##  3rd Qu.:   18936   3rd Qu.:   18391   3rd Qu.:   42114
##  Max.   :10568010   Max.   :15141245   Max.   :29620098
##      TCURISAL          TCURIBEN          PPCSTOT          PPSALWG
##  Min.   :       0   Min.   :      0   Min.   :     0   Min.   :     0
##  1st Qu.:    2030   1st Qu.:    692   1st Qu.:  9455   1st Qu.:  5433
##  Median :    5133   Median :   2058   Median : 11035   Median :  6347
##  Mean   :   16947   Mean   :   6928   Mean   : 12796   Mean   :  7237
##  3rd Qu.:   13947   3rd Qu.:   5795   3rd Qu.: 14345   3rd Qu.:  8031
##  Max.   :10044302   Max.   :6258743   Max.   :374873   Max.   :183063
##    PPEMPBEN         PPITOTAL         PPISALWG        PPIEMBEN
##  Min.   :    0   Min.   : -1472   Min.   :     0   Min.   :    0
##  1st Qu.: 1827   1st Qu.:  5637   1st Qu.:  3641   1st Qu.: 1200
##  Median : 2552   Median :  6566   Median :  4269   Median : 1669
##  Mean   : 2981   Mean   :  7606   Mean   :  4861   Mean   : 1995
##  3rd Qu.: 3633   3rd Qu.:  8554   3rd Qu.:  5426   3rd Qu.: 2490
##  Max.   :76688   Max.   :228102   Max.   :131903   Max.   :55750
The data varies a lot:
Histogram exploration
Investigating how large the school districts are (note that each district may contain a different number of separate schools). Also investigating what the per-student spending looks like. The median is marked with a red dashed line.
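A histogram with the median marked by a red dashed line could be drawn roughly like this. It is a base-R sketch with made-up enrollment values; the original plots may well have been built with a different plotting package and different binning.

```r
# Made-up enrollment figures, for illustration only.
enroll <- c(50, 120, 450, 900, 1163, 2500, 3060, 8000, 40000)

enroll_median <- median(enroll)

# Histogram of district sizes.
hist(enroll, breaks = 20,
     main = "District enrollment", xlab = "Students enrolled")

# Mark the median with a red dashed vertical line.
abline(v = enroll_median, col = "red", lty = 2)
```

With data this skewed, most districts pile up in the first few bins, which is one reason to look at a restricted enrollment range later on.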
GGPAIRS matrix - Getting some basic correlation data
Plots of enrollment vs expenses
The relationship between enrollment and total expenses looks very linear at the lower numbers, but as enrollment goes up, the relationship scatters. Even where the relationship is fairly linear, there is quite a bit of variation and some outliers.
Note that at this point decided to look mainly at districts with fewer than 125,000 students, rather than districts like NYC, which has almost a million students, or Hawaii, where the whole state is one district.
Plots of enrollment vs total per student spending
The horizontal line on the plots above is the median. At the lower enrollments, spending per student appears to hover around the median, but there are quite a few districts where the spending is substantially higher. This starts to bring up questions:
Plots of enrollment vs revenue sources (Federal, State, Local)
Added a red dashed line for enrollment = 10,000 students to make it easier to compare the various sources of funding.
Where the money comes from looks interesting. State funding looks pretty linear with enrollment. It looks like most money for education comes from state and local sources.
Based on the plots above there is a need to create additional data to get more specific plots.
##      PPTISB           PPPOI          PLOCALREV
##  Min.   :     0   Min.   :0.0000   Min.   :0.0000
##  1st Qu.:  5004   1st Qu.:0.5002   1st Qu.:0.2696
##  Median :  5882   Median :0.5420   Median :0.3997
##  Mean   :  6856   Mean   :0.5371   Mean   :0.4307
##  3rd Qu.:  7757   3rd Qu.:0.5803   3rd Qu.:0.5817
##  Max.   :187653   Max.   :2.1406   Max.   :0.9904
## NA's :2
The summary showed values over 1 (that is, over 100%) for the share of total per-pupil spending that goes to salaries, wages, and benefits. Did some spot checks and found that only a handful of schools were impacted. This brings up questions about the reliability of the data, but since this is not for professional use, will not investigate further.
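The derived columns and the spot check can be sketched as below. The formulas are assumptions inferred from the summaries, not taken from the original code: `PPTISB` is read as `PPISALWG + PPIEMBEN`, `PPPOI` as `PPTISB / PPCSTOT`, and `PLOCALREV` as `TLOCREV / TOTALREV`. All values are made up.

```r
# Toy rows; the column names follow the cleaned data, the values are hypothetical.
districts <- data.frame(
  PPISALWG = c(4000, 5000, 0, 900),
  PPIEMBEN = c(1500, 2000, 0, 300),
  PPCSTOT  = c(11000, 13000, 0, 1000),
  TLOCREV  = c(2000, 9000, 0, 500),
  TOTALREV = c(6000, 15000, 300, 1000)
)

# Assumed definitions of the added columns (see the lead-in).
districts$PPTISB    <- districts$PPISALWG + districts$PPIEMBEN
districts$PPPOI     <- districts$PPTISB / districts$PPCSTOT
districts$PLOCALREV <- districts$TLOCREV / districts$TOTALREV

# Spot check: rows where the share exceeds 1, and rows where it is undefined.
suspect   <- subset(districts, PPPOI > 1)
n_missing <- sum(is.na(districts$PPPOI))
print(suspect)
print(n_missing)
```

Under these assumed definitions, a district reporting zero for both the numerator and the denominator produces 0/0, which would be one way for missing values to appear in a ratio column.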
Plots looking at percentage of local revenue vs other variables
Had expected to maybe be able to see a trend, but did not see anything useful. Decided to look at a subset of data with a limited number of states.
Plots looking at data for a subset of states
Starting to see some noticeable differences:
Plot looking at just 1 state (New York) a little deeper
It does appear that total per-student spending goes up when the percent of the school district's funding that comes from local revenue goes up.
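One way to put a number on a trend like this is a correlation and a simple linear fit. The sketch below uses made-up values for a single state and assumes `PLOCALREV` is the local share of revenue and `PPCSTOT` is total per-student spending; it is an illustration of the check, not a result from the real data.

```r
# Hypothetical districts from one state; values are invented for illustration.
one_state <- data.frame(
  PLOCALREV = c(0.20, 0.35, 0.50, 0.65, 0.80),
  PPCSTOT   = c(17000, 19500, 21000, 24500, 27000)
)

# Pearson correlation between local revenue share and per-student spending.
r <- cor(one_state$PLOCALREV, one_state$PPCSTOT)

# Simple linear fit; the slope is dollars per unit change in local share.
fit   <- lm(PPCSTOT ~ PLOCALREV, data = one_state)
slope <- coef(fit)[["PLOCALREV"]]

print(r)
print(slope)
```

A positive `r` and slope would agree with the visual impression, though neither says anything about why the two move together.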
Narrowing down to instructional salary and benefits
Looking at a few more factors.
Narrowing down to just 3 states, NJ, CA, & OK
CA is way below NJ and not too far above OK, although it’s pretty well known that it’s much more expensive to live in California. So what happens if a cost of living index is applied to the values?
Factoring in the cost of living for NJ, CA, & OK
Grabbed some data from 2017 (close enough) from: US Learning
To normalize the data, the total spent on salaries and benefits per pupil is divided by (index / 100), which keeps the result in a dollar-like amount.
Based on the graphs above, once cost of living is accounted for, it seems Oklahoma teachers get paid more than those in either New Jersey or California.
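The cost of living adjustment amounts to one division per row. The sketch below uses placeholder states and invented numbers; `COL_IDX` is a hypothetical column name for the index, where 100 represents the reference cost of living.

```r
# Placeholder states with invented per-pupil figures and index values.
states <- data.frame(
  STATE   = c("State A", "State B", "State C"),
  PPTISB  = c(9000, 6000, 4500),   # per-pupil instructional salary + benefits
  COL_IDX = c(120, 150, 90)        # hypothetical cost of living index
)

# Divide by index / 100 so the result stays in dollar-like units.
states$PPTISB_COL <- states$PPTISB / (states$COL_IDX / 100)
print(states)
```

In this toy example State C starts with the lowest raw figure but ends up ahead of State B after the adjustment, which is the kind of reordering the normalization can produce.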
Factoring in the class size for NJ, CA, & OK
In theory, class size would have an impact on the actual salaries. If compensation is the per-student money spent on instructional salaries and benefits, then:
If the per student instructional compensation were the same.
(Note: This doesn’t examine the quality of education or the difficulty of the teaching role as class size increases.)
From:
National Center for Education Statistics
This information is from 2012, so things may be a little old. Four years is a long time, but unless class sizes in these states increased or decreased disproportionately to other states, the results should be similar.
Decided to use the numbers provided in the following way:
(Average class size by level of instruction, Elementary + Average class size by level of instruction, Secondary)
divided by 2. Then normalize that using the average US class size.
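The class size factor described above can be sketched as follows. The class sizes are invented, the column names are hypothetical, and the last step (multiplying per-student compensation by the relative class size) is an assumption about how the factor is applied, based on the reasoning that a teacher's compensation scales with the number of students taught.

```r
# Invented class sizes; "US" stands for the national average row.
class_size <- data.frame(
  STATE      = c("State A", "State B", "US"),
  ELEMENTARY = c(20, 26, 22),
  SECONDARY  = c(24, 30, 26)
)

# Average the elementary and secondary class sizes.
class_size$AVG <- (class_size$ELEMENTARY + class_size$SECONDARY) / 2

# Normalize by the US average so the US row equals 1.
us_avg <- class_size$AVG[class_size$STATE == "US"]
class_size$REL <- class_size$AVG / us_avg

# Assumed application: scale cost-of-living-adjusted per-student figures
# (hypothetical values) by the relative class size.
per_student_col <- c(5000, 4000)
scaled <- per_student_col * class_size$REL[1:2]
print(class_size)
print(scaled)
```

The scaled values are relative indicators only; they are no longer dollar amounts for any particular state.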
Results: Once you add in cost of living and class size, while there is a lot of variation across states, there doesn’t appear to be a big difference between California, New Jersey, and Oklahoma in terms of instructional salaries and benefits.
Using the cleaned up data from:
2016 Public Elementary-Secondary Education Finance Data
to graph the per-student spending on instructional salary and benefits for California, New Jersey, and Oklahoma, there is a clear indication that per-student spending in Oklahoma is far less than in California or New Jersey for school districts with between 1,000 and 10,000 students.
It also appears that the variation of what is spent per student is far greater in California and New Jersey than in Oklahoma.
Using this graph alone:
Grabbed some data from 2017 (close enough) from: US Learning
Normalized for cost of living:
The states switch places. California falls to the bottom, New Jersey ends up in the middle, and it appears that Oklahoma teachers are the best paid. Also, the variation in spending per student on instructional salaries and benefits for Oklahoma appears larger once the cost of living index is applied, since a dollar in Oklahoma goes farther than it does in California or New Jersey.
Because the data on plot 1 and plot 2 is NOT strictly reflective of how much a particular teacher gets paid in any of the states it is necessary to use a normalizer to try to compute what the relationships might look like.
Using class size information:
From:
National Center for Education Statistics
This information is from 2012, so things may be a little old. Four years is a long time.
Decided to use the numbers provided in the following way:
Average class size for teachers in self-contained classes + Average class size for teachers in departmentalized instruction, divided by 2. Then normalize that using the average US class size.
Normalizing the data in plot 2 with this factor created the plot above.
Now the data from all three states starts to overlap, indicating that there may be less of a difference by state in what is spent on teacher salary and benefits than the original data alone showed.
Note that the final numbers on the y axis are not real numbers for any of the states, but an indication of spending relative to each other.
What were some of the struggles?
What went well?
What was surprising?
What further investigations could be done?